16  Week 9: Reaction properties

Reaction property prediction is a crucial task, as it enables us to better understand chemical reactions and their outcomes. This not only contributes to the development of new chemical compounds and materials but also helps in streamlining the reaction process and reducing the time and resources required for experimentation.

Two significant aspects of reaction property prediction are yield prediction and atom mapping. Yield prediction refers to forecasting the amount of product generated by a particular chemical reaction. Accurate yield prediction can help in optimizing reaction conditions, minimizing waste, and identifying the most efficient synthetic routes for a target molecule.

Atom mapping, on the other hand, is the process of determining the correspondence between atoms in the reactants and products of a chemical reaction. This information is essential for understanding the mechanism of the reaction and tracking the transformation of individual atoms during the reaction process. Atom mapping plays a vital role in various applications, including reaction database management, reaction classification, and the development of reaction templates for computer-aided synthesis planning.
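To make the idea of atom correspondence concrete, the sketch below reads the atom-map numbers (the `:n` labels inside square brackets) from a mapped reaction SMILES and compares both sides of the reaction. It uses only the standard library. The helper names and the toy reaction are illustrative assumptions, and the code assumes the simple `reactants>>products` form.

```python
import re

# Matches the ":<number>]" suffix that closes a mapped atom such as [CH3:1].
MAP_PATTERN = re.compile(r":(\d+)\]")


def map_numbers(smiles):
    """Return the set of atom-map numbers found in a SMILES fragment."""
    return {int(n) for n in MAP_PATTERN.findall(smiles)}


def compare_mapping(rxn_smiles):
    """Split a mapped reaction SMILES and compare map numbers on both sides."""
    reactants, products = rxn_smiles.split(">>")
    r_maps, p_maps = map_numbers(reactants), map_numbers(products)
    return {
        "shared": r_maps & p_maps,          # atoms tracked from reactants to products
        "only_reactants": r_maps - p_maps,  # mapped atoms that do not reach the product
        "only_products": p_maps - r_maps,   # mapped product atoms with no reactant origin
    }


# Hypothetical toy reaction: methylation of methanol by bromomethane.
example = "Br[CH3:1].[OH:2][CH3:3]>>[CH3:1][O:2][CH3:3]"
print(compare_mapping(example))
```

A complete mapping of the product should leave `only_products` empty. Unmapped atoms, such as the bromine here, simply carry no label.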

Atom mapping

In this notebook, we will explore some applications of atom mapping, as well as one of the tools available for computing it.

! pip install rdkit rdchiral reaction-utils
! mkdir -p data/
! curl -L https://www.dropbox.com/sh/6ideflxcakrak10/AADN-TNZnuGjvwZYiLk7zvwra/schneider50k -o data/uspto50k.zip
! unzip -o data/uspto50k.zip -d data/
! wget https://raw.githubusercontent.com/schwallergroup/ai4chem_course/main/notebooks/09%20-%20Reaction%20Properties/utils.py
from utils import *

0. Relevant packages

RXNMapper

RXNMapper is a deep learning tool that computes the atom mapping of a reaction. This open-source tool derives the mapping from the attention weights of a pretrained transformer model and performs remarkably well compared to other available tools. It is an excellent example of the possible uses of unsupervised learning in chemistry. See more details here.

RDChiral

RDChiral is a wrapper around RDKit's reaction-handling functionality that improves the handling of stereochemistry. This package lets us extract reaction templates from a reaction dataset; templates are a standard way of encoding transformation rules.

RDChiral also lets us apply a reaction template to a target molecule to find the reactants that would afford it under the given transformation.

Learn more from the code and the paper.

1. Obtaining the atom mapping

To obtain the atom mapping of a reaction, you can go to this site and paste your reaction SMILES. The application will then show you the mapped reaction SMILES, as well as some visualization options, including:

  • The atom mapping of the reaction: which atoms in the reactants correspond to each atom in the products.

  • The attention maps: what the underlying model computes, that is, the connection between each pair of tokens.
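The link between attention maps and atom mapping can be illustrated with a toy example. The sketch below is a simplified illustration and not RXNMapper's actual implementation. It assumes we already have an attention weight for every (product atom, reactant atom) pair, and it pairs atoms greedily from the strongest weight down, allowing only atoms of the same element and using each reactant atom at most once. The function name and all numbers are hypothetical.

```python
def greedy_atom_mapping(attention, product_atoms, reactant_atoms):
    """Assign each product atom to one reactant atom using attention scores.

    attention[i][j] is the weight between product atom i and reactant atom j.
    Only atoms of the same element may be paired, and each reactant atom is
    used at most once.
    """
    # Collect all admissible (score, product index, reactant index) candidates.
    candidates = [
        (attention[i][j], i, j)
        for i, p in enumerate(product_atoms)
        for j, r in enumerate(reactant_atoms)
        if p == r
    ]
    mapping, used_reactants = {}, set()
    # Visit the pairs from the strongest attention to the weakest.
    for _score, i, j in sorted(candidates, reverse=True):
        if i not in mapping and j not in used_reactants:
            mapping[i] = j
            used_reactants.add(j)
    return mapping


# Hypothetical weights for CBr + CO -> COC (heavy atoms only).
reactant_atoms = ["C", "Br", "C", "O"]
product_atoms = ["C", "O", "C"]
attention = [
    [0.70, 0.05, 0.20, 0.05],
    [0.05, 0.05, 0.10, 0.80],
    [0.25, 0.05, 0.65, 0.05],
]
mapping = greedy_atom_mapping(attention, product_atoms, reactant_atoms)
print(mapping)  # keys are product atom indices, values are reactant atom indices
```

The one-to-one constraint matters: without it, two product carbons could both point to the same reactant carbon whenever their attention rows peak in the same column.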

NOTE: This model is also accessible through a programming interface. For this, follow the instructions here.

TODO:

Our task in this notebook is to highlight reactions in a dataset that may have selectivity issues.

First, let’s encode a generic aromatic substitution reaction as a reaction template.

gen_rxn = 'Br[CH3:1].[cH:2]1[cH:3][cH:4][cH:5][c:6]([CH3:7])[cH:8]1>>[CH3:1][c:2]1[cH:3][cH:4][cH:5][c:6]([CH3:7])[cH:8]1'
visualize_chemical_reaction(gen_rxn)

gen_template = extract_template(gen_rxn)
print(f'The reaction template looks like this: {gen_template}')

The reaction template looks like this: Br-[CH3;D1;+0:1].[cH;D2;+0:2]>>[CH3;D1;+0:1]-[c;H0;D3;+0:2]
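To read a template like the one printed above, it helps to split it into its patterns. In SMARTS, `D<n>` is the number of explicit connections of an atom, `H0` means no attached hydrogens, and `+0` is a formal charge of zero. The template therefore says: a methyl carbon bonded to bromine and an aromatic CH with two neighbours become a methyl carbon bonded to an aromatic carbon that has no hydrogen and three neighbours. The helper below is a minimal, standard-library sketch (its name is an assumption) that separates the two sides by plain string splitting.

```python
def describe_template(template):
    """Split a reaction template of the form 'reactants>>products' into patterns."""
    reactant_side, product_side = template.split(">>")
    return {
        # Individual patterns on each side are separated by "."
        "reactant_patterns": reactant_side.split("."),
        "product_patterns": product_side.split("."),
    }


# Template string copied from the output above.
template = "Br-[CH3;D1;+0:1].[cH;D2;+0:2]>>[CH3;D1;+0:1]-[c;H0;D3;+0:2]"
parts = describe_template(template)
for side, patterns in parts.items():
    print(side, patterns)
```

Note that the second reactant pattern matches any aromatic CH, not a specific ring position, so a substrate with several non-equivalent aromatic C-H sites can match in more than one way.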

Now, if we apply this template to the same set of reactants, we get multiple products.

products = apply_template(gen_template, 'CBr.Cc1ccccc1')
for p in products:
    display(p[0])

This highlights a potential selectivity issue in our reaction. What other reactions in USPTO have selectivity issues? This matters for model evaluation: sometimes a model predicts the correct reaction type but fails to predict the correct selectivity!
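One simple way to turn this observation into a check is to count the distinct products a template generates for a given set of reactants. The sketch below assumes the outcomes are available as SMILES strings that are already canonical, so identical molecules compare equal as text. Both function names and the example list are illustrative assumptions, not part of `utils.py`.

```python
def unique_outcomes(product_smiles):
    """Deduplicate product SMILES while preserving first-seen order."""
    return list(dict.fromkeys(product_smiles))


def has_selectivity_issue(product_smiles):
    """Flag a reactant set when the template yields more than one distinct product."""
    return len(unique_outcomes(product_smiles)) > 1


# Hypothetical outcomes for the methylation of toluene: the ortho and meta
# products each arise from two equivalent ring positions, the para product from one.
outcomes = [
    "Cc1ccccc1C",
    "Cc1cccc(C)c1",
    "Cc1ccc(C)cc1",
    "Cc1cccc(C)c1",
    "Cc1ccccc1C",
]
print(unique_outcomes(outcomes))
print(has_selectivity_issue(outcomes))
```

Deduplicating first avoids flagging reactions whose multiple matches all lead to the same molecule because of symmetry.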

import re

def remove_aam(smi):
    # Strip atom-map numbers (e.g. ":1" in "[CH3:1]"), then canonicalize the SMILES
    return canonicalize_smiles(re.sub(r"(?<=[^\*])(:\d+)]", "]", smi))

# Let's find all reactions in USPTO where this reaction template is valid
train_df, val_df, test_df = load_data()

# Get only the reactants from the train set
train_reacts = train_df['reactants>reagents>production'].apply(lambda x: remove_aam(x.split('>>')[0]))
train_reacts
0        COC(=O)[C@H](CCCCNC(=O)OCc1ccccc1)NC(=O)Nc1cc(...
1        Nc1cccc2cnccc12.O=C(O)c1cc([N+](=O)[O-])c(Sc2c...
2        CCNCC.Cc1nc(-c2ccc(C=O)cc2)sc1COc1ccc([C@H](CC...
3        CC1(C)CCC(CN2CCN(c3ccc(C(=O)NS(=O)(=O)c4ccc(NC...
4        CCOc1ccc(Oc2ncnc3c2cnn3C2CCNCC2)c(F)c1.O=C(Cl)...
                               ...                        
40003    COC(=O)CCC(C(N)=O)N1Cc2c(OCc3ccc(CBr)cc3)cccc2...
40004                               COc1ccc2cc(Br)ccc2c1Br
40005                            CCC(O)CCCCCCC=CCCCCC(=O)O
40006    BrC(Br)(Br)Br.OCCCCCC1=C(c2ccc(O)cc2)CCCc2cc(O...
40007                      COC(=O)c1ccnc([N+](=O)[O-])c1.N
Name: reactants>reagents>production, Length: 40008, dtype: object
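The regular expression inside `remove_aam` can be tried in isolation. The sketch below applies the same pattern with the standard library only and leaves out the canonicalization step, so the square brackets around each atom are still present in the result. The function name `strip_map_numbers` is an assumption introduced for this illustration.

```python
import re

# Same pattern as in remove_aam: a ":<digits>" group directly before "]",
# preceded by any character other than "*".
AAM_PATTERN = re.compile(r"(?<=[^\*])(:\d+)]")


def strip_map_numbers(smiles):
    """Remove atom-map numbers but keep the rest of the SMILES untouched."""
    return AAM_PATTERN.sub("]", smiles)


mapped = "Br[CH3:1].[cH:2]1[cH:3][cH:4][cH:5][c:6]([CH3:7])[cH:8]1"
stripped = strip_map_numbers(mapped)
print(stripped)
```

The lookbehind `(?<=[^\*])` leaves labels on wildcard atoms such as `[*:1]` in place, which is why the pattern is slightly more involved than a plain `:\d+` replacement.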
# Let's apply this template to every set of reactants in the train set
for reacts in train_reacts:
    try:
        prod = apply_template(gen_template, reacts)
        if len(prod) > 0:
            print('\nPotential selectivity issue found:')
            visualize_mols(reacts)
            print('Possible products:')
            for p in prod:
                display(p[0])
    except Exception:
        # Skip reactant sets that cannot be processed with this template
        continue

Potential selectivity issue found:
Possible products:

Potential selectivity issue found:
Possible products:
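For a dataset of this size, it can be easier to collect the hits than to display each one. The sketch below is one possible arrangement, not the notebook's own code: it takes the template application as a callable, records the index and number of outcomes for every match, and counts failures instead of silently discarding them. `scan_for_selectivity` and the stand-in `fake_apply` are hypothetical; in the notebook one would pass something like `lambda r: apply_template(gen_template, r)` instead.

```python
def scan_for_selectivity(reactant_sets, apply_fn):
    """Apply apply_fn to each reactant set and summarize the results.

    Returns a list of (index, number_of_outcomes) for every reactant set the
    template matched, plus the number of reactant sets that raised an error.
    """
    hits, failures = [], 0
    for idx, reacts in enumerate(reactant_sets):
        try:
            outcomes = apply_fn(reacts)
        except Exception:
            failures += 1  # keep a tally so problems do not go unnoticed
            continue
        if len(outcomes) > 0:
            hits.append((idx, len(outcomes)))
    return hits, failures


def fake_apply(reacts):
    """Stand-in for a real template application, used only for this demo."""
    if reacts == "bad":
        raise ValueError("cannot parse")
    return ["p1", "p2"] if "Br" in reacts else []


hits, failures = scan_for_selectivity(["CBr.Cc1ccccc1", "CCO", "bad"], fake_apply)
print(hits, failures)
```

Returning indices rather than printing makes it straightforward to look up the matching rows in `train_df` afterwards and to inspect only a sample of them.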